Linguistic database summaries and their protoforms: towards natural language based knowledge discovery tools

Authors

  • Janusz Kacprzyk
  • Slawomir Zadrozny
Abstract

We consider linguistic data(base) summaries in the sense of Yager [Information Sciences 28 (1982) 69–86], exemplified for a personnel database by “most employees are young and well paid” (with some degree of truth added), as an intuitive, human-consistent and natural-language-based knowledge discovery tool. We first present an extension of the classic Yager's approach that involves more sophisticated criteria of goodness, search methods, etc. We advocate the use of the concept of a protoform (prototypical form), recently advocated by Zadeh [A prototype-centered approach to adding deduction capabilities to search engines – the concept of a protoform. BISC Seminar, University of California, Berkeley, 2002], as a general form of a linguistic data summary. We present an extension of our interactive approach, based on fuzzy logic and fuzzy database queries, which makes it possible to implement such linguistic data summaries. We show how fuzzy queries are related to linguistic summaries, and show that one can introduce a hierarchy of protoforms, or abstract summaries, in the sense of Zadeh's latest ideas [BISC Seminar, University of California, Berkeley, 2002], which are meant mainly for increasing the deduction capabilities of search engines. For illustration we show an implementation for a sales database of a computer retailer, employing one type of protoform of a linguistic summary.

J. Kacprzyk, S. Zadrożny / Information Sciences 173 (2005) 281–304. doi:10.1016/j.ins.2005.03.002. 2005 Published by Elsevier Inc. Corresponding author at: Systems Research Institute, Polish Academy of Sciences, ul. Newelska 6, 01-447 Warsaw, Poland. Tel.: +48 22 836 44 14 01 447; fax: +48 22 837 27 72.
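
As an illustration of what a degree of truth for a summary such as “most employees are young and well paid” can look like, the following is a minimal Python sketch of Yager's basic formula, truth = μ_Q((1/n) Σ μ_S(y_i)). It is not the authors' implementation. The membership functions `young`, `well_paid` and `most`, their breakpoints, the use of the minimum for “and”, and the sample records are all assumptions made only for this example.

```python
from typing import Callable, Sequence, Tuple

Employee = Tuple[float, float]  # (age in years, monthly salary); hypothetical record layout


def young(age: float) -> float:
    """Assumed fuzzy set: fully 'young' up to 30, not 'young' from 40 on."""
    if age <= 30:
        return 1.0
    if age >= 40:
        return 0.0
    return (40 - age) / 10


def well_paid(salary: float) -> float:
    """Assumed fuzzy set: not 'well paid' up to 3000, fully 'well paid' from 5000 on."""
    if salary <= 3000:
        return 0.0
    if salary >= 5000:
        return 1.0
    return (salary - 3000) / 2000


def most(proportion: float) -> float:
    """Assumed relative quantifier 'most': 0 up to 0.3, 1 from 0.8 on, linear in between."""
    if proportion <= 0.3:
        return 0.0
    if proportion >= 0.8:
        return 1.0
    return 2 * proportion - 0.6


def truth_of_summary(
    records: Sequence[Employee],
    summarizer: Callable[[Employee], float],
    quantifier: Callable[[float], float],
) -> float:
    """Degree of truth of 'Q records are S': the quantifier applied to the mean membership."""
    if not records:
        raise ValueError("cannot summarize an empty data set")
    mean_membership = sum(summarizer(r) for r in records) / len(records)
    return quantifier(mean_membership)


def young_and_well_paid(employee: Employee) -> float:
    """Summarizer 'young and well paid', with 'and' modelled by the minimum (an assumption)."""
    age, salary = employee
    return min(young(age), well_paid(salary))


# Made-up sample data, for illustration only.
employees = [(25, 5500), (28, 6000), (35, 4000), (50, 5000)]
truth = truth_of_summary(employees, young_and_well_paid, most)
print(f"T(most employees are young and well paid) = {truth:.2f}")
```

With these assumed definitions the four records have memberships 1, 1, 0.5 and 0 in “young and well paid”, so the mean is 0.625 and the quantifier “most” maps it to a truth degree of about 0.65. Other quantifiers or summarizers can be swapped in by passing different functions.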

Similar resources

Protoforms of Linguistic Database Summaries as a Human Consistent Tool for Using Natural Language in Data Mining

We consider linguistic database summaries in the sense of Yager (1982), in an implementable form proposed by Kacprzyk & Yager (2001) and Kacprzyk, Yager & Zadrożny (2000), exemplified by, for a personnel database, “most employees are young and well paid” (with some degree of truth) and their extensions as a very general tool for a human consistent summarization of large data sets. We advocate t...

Towards human consistent data driven decision support systems using verbalization of data mining results via linguistic data summaries

We present how the conceptually and numerically simple concept of a fuzzy linguistic database summary can be a very powerful tool for gaining much insight into the essence of data that may be relevant for a business activity. The use of linguistic summaries provides tools for the verbalization of data analysis (mining) results which, in addition to the more commonly used visualization e.g. via ...

A distance metric for a space of linguistic summaries

Producing linguistic summaries of large databases or temporal sequences of measurements is an endeavor that is receiving increasing attention. These summaries can be used in a continuous monitoring situation, like eldercare, where it is important to ascertain if the current summaries represent an abnormal condition. It is therefore necessary to compute the distance between summaries as a basis ...

On Multi-subjectivity in Linguistic Summarization of Relational Databases

We focus on one of the most powerful computing methods for natural-language-driven representation of data, i.e. on Yager’s concept of a linguistic summary of a relational database (1982). In particular, we introduce an original extension of that concept: new forms of linguistic summaries. The new forms are named Multi-Subject linguistic summaries, because they are constructed to handle more tha...

Using OLAP and Data Mining for Content Planning in Natural Language Generation

We present a new approach to content determination and content organization in the context of natural language generation for quantitative database summaries. Three key properties make our work innovative and interesting: (1) we developed a new text planning approach to deal with the content organization of a data set into a summary report, for example a Data Mining discovery; (2) the approach...


Journal title:
  • Inf. Sci.

Volume 173, Issue

Pages 281–304

Publication date 2005